Simulation of aerial images

ABSTRACT

A method for generating a simulated aerial image of a mask projected by an optical system includes determining a coherence characteristic of the optical system. A coherent decomposition of the optical system is computed based on the coherence characteristic. The decomposition includes a series of expansion functions having angular and radial components that are expressed as explicit functions. The expansion functions are convolved with a transmission function of the mask in order to generate the simulated aerial image.

CROSS-REFERENCE TO RELATED APPLICATION

The present patent application is a DIVISIONAL of, claims priority to and incorporates by reference U.S. patent application Ser. No. 10/928,537, filed 27 Aug. 2004, which is related to U.S. patent application Ser. No. 10/928,390, filed 27 Aug. 2004, now U.S. Pat. No. 7,310,796 issued 18 Dec. 2007, entitled “System and Method for Simulating an Aerial Image,” which is assigned to the assignee of the present patent application and whose disclosure is incorporated herein by reference.

FIELD OF THE INVENTION

The present invention relates generally to photolithography, and specifically to simulation of aerial images produced by projecting a mask onto a target surface.

BACKGROUND OF THE INVENTION

Photolithography is an essential tool in reproduction of fine patterns on a substrate, and is very widely used in production of microelectronic devices. As the design rules used in such devices become ever finer, mask designers must increasingly resort to reticle enhancement technologies, such as the use of serifs, assist lines and phase shift masks, in order to project the desired pattern onto the device substrate. The aerial image that is actually formed on the substrate is a complex function of the characteristics of the illumination source and optics that are used in the lithographic process and of diffraction and interference effects caused by the structures on the mask itself. Mask designers need simulation systems that model these effects in order to predict the pattern that will be formed on the substrate by particular arrangements of mask features.

Simulation of the aerial image is complicated by the fact that practical lithography systems use partially-coherent illumination. For optical systems that are nearly paraxial, the intensity of the aerial image at the image plane with partially-coherent illumination of the mask is given by the well-known Hopkins formula:

I({right arrow over (x)})=∫_(S) T({right arrow over (η)})T*({right arrow over (i)})·J({right arrow over (η)}−{right arrow over (i)})·A({right arrow over (η)}−{right arrow over (x)})·A*({right arrow over (η)}−{right arrow over (x)})d{right arrow over (η)}d{right arrow over (i)}  (1)

Here T is the transmission function of the mask; J is the mutual coherence function of the illumination source (typically the Fourier transform of the condenser aperture function); and A is the point spread function (PSF) of the projection system. Formula (I) is a quadruple integral, taken over coordinates {right arrow over (i)}=(i₁, i₂) and {right arrow over (η)}=(η₁, η₂) in the mask plane. Although direct computation of this integral is possible, it becomes intractable for large-scale masks.

In order to reduce the complexity of aerial image simulation, a number of authors have suggested using optimal coherent decomposition (OCD) to approximate the optical properties of the partially-coherent imaging system as an incoherent sum of a finite number of coherent imaging systems. The aerial images that would be formed by each of the coherent imaging systems are then computed and summed together to give the total, simulated aerial image. Pati et al. provide a useful overview of OCD methods in “Exploiting Structure in Fast Aerial Image Computation for Integrated Circuit Patterns,” IEEE Transactions on Semiconductor Manufacturing 10:1 (February, 1997), pages 62-73, which is incorporated herein by reference. This article describes the use of “basis” (or building block) images, which correspond to certain types of integrated circuit patterns, in order to compute aerial images by OCD.

Von Bunau et al. describe a related method of OCD in “Optimal Coherent Decompositions for Radially Symmetric Optical Systems,” Journal of Vacuum Science and Technology B15:6 (November/December, 1997), pages 2412-2416, which is incorporated herein by reference. The authors show that for optical systems that are radially symmetrical, the point spread functions and pupil functions corresponding to each term in the OCD expansion are separable in polar coordinates. They derive analytical expressions for the angular dependence of these terms and an integral equation for the radial dependence.

A number of methods for aerial image simulation have been described in the patent literature. For example, U.S. Pat. No. 6,223,139, whose disclosure is incorporated herein by reference, describes kernel-based fast aerial image computation for a large-scale design of integrated circuit patterns. The method is based on determining an appropriate sampling range and sampling interval for use in generating simulated aerial images of a mask pattern, so as to enhance computation speed without sacrificing accuracy. U.S. Patent Application Publication 2002/0062206, whose disclosure is also incorporated herein by reference, describes a method for fast aerial image simulation, using a kernel that is calculated based on an orthogonal pupil projection of the parameters of the optical projection system onto a basis set. A vector is calculated based on an orthogonal mask projection of the parameters of the mask onto the basis set, and the field intensity distribution in the image plane is then calculated using the kernel and the vector.

SUMMARY OF THE INVENTION

Embodiments of the present invention provide improved methods and systems for computing a coherent decomposition of a partially-coherent optical system. The decomposition uses expansion functions, including both radial and angular components, which are expressed in explicit form, thus providing greater ease and speed of computation than OCD methods known in the art. The methods of OCD taught by the present invention are particularly useful in simulating aerial images produced by a projection system, such as the aerial image projected by a photolithographic mask.

In some embodiments of the present invention, the radial components of the expansion functions comprise Bessel functions. These functions typically depend, inter alia, on the degree of coherence of the optical system, σ, which is expressed as a ratio of the numerical aperture (NA) of the illumination optics to the NA of the projection optics. Additionally or alternatively, the expansion functions may contain polynomial functions, such as Zernike polynomials, in order to account for the effect of aberrations in the optical system.

In some embodiments of the present invention, OCD is applied to simulate aerial images that are produced by a mask under inspection in a mask inspection system. The system reads the mask transmission function from a design database, and computes a simulated aerial image based on the transmission function and on the optical characteristics of the mask inspection system. An imaging device, such as an electronic imaging camera, captures the actual aerial image that is formed by the mask in the inspection system, and an image processor compares the actual image to the simulated image. When the processor finds a significant deviation between the actual and simulated aerial images, it notes the corresponding location on the mask as a potential defect site. This approach permits detection of defects, such as phase defects, that are not visible under direct inspection.

In other embodiments of the present invention, OCD is used to measure the aberrations of an optical projection system, by comparing a simulated aerial image to an actual aerial image of a known reference pattern. In these embodiments, the expansion functions are used to generate the simulated image assuming small aberrations of variable magnitude. The magnitude of the aberrations is then adjusted so as to fit the simulated image to the actual image. The optimal fit parameters indicate the magnitudes of the actual aberrations.

There is therefore provided, in accordance with an embodiment of the present invention, a method for generating a simulated aerial image of a mask projected by an optical system, including:

determining a coherence characteristic of the optical system;

computing a coherent decomposition of the optical system based on the coherence characteristic, the decomposition including a series of expansion functions having angular and radial components that are expressed as explicit functions; and

convolving the expansion functions with a transmission function of the mask in order to generate the simulated aerial image.

Typically, determining the coherence characteristic includes calculating a coherence ratio of the optical system. In a disclosed embodiment, the expansion functions are components of eigenfunctions of a kernel function representing characteristics of the optical system, and computing the coherent decomposition includes determining the expansion functions without computing eigenvalues of the kernel function.

In some embodiments, computing the coherent decomposition includes determining one or more aberrations of the optical system, and computing the expansion functions responsively to the aberrations. Typically, determining the one or more aberrations includes representing a point spread function (PSF) of the optical system as a series of aberration terms having respective coefficients, and computing the expansion functions includes calculating the expansion functions responsively to the respective coefficients of the aberration terms. In one embodiment, calculating the expansion functions includes expressing the PSF as a Taylor series in terms of the coefficients of the aberration terms. Additionally or alternatively, computing the expansion functions includes calculating the expansion functions responsively to a defocus of the optical system.

There is also provided, in accordance with an embodiment of the present invention, a method for generating a simulated aerial image of a mask projected by an optical system, including:

defining a kernel function representing characteristics of the optical system;

computing a coherent decomposition of the optical system into a series of expansion functions, which are components of eigenfunctions of the kernel function, without computing eigenvalues of the kernel function; and

convolving the expansion functions with a transmission function of the mask in order to generate the simulated aerial image.

There is additionally provided, in accordance with an embodiment of the present invention, a method for inspecting a mask, including:

projecting radiation through the mask using an optical system so as to form an actual aerial image;

deriving a transmission function from a design of the mask;

generating a simulated aerial image by convolving a mathematical model of the optical system with the transmission function; and

comparing the simulated aerial image to the actual aerial image in order to identify a defect in the mask.

There is further provided, in accordance with an embodiment of the present invention, a method for determining characteristics of an optical system, including:

projecting radiation through a mask using the optical system so as to form a reference aerial image;

generating a simulated aerial image by convolving a transmission function of the mask with a mathematical model of the optical system, the model including a variable parameter representative of a magnitude of an aberration of the optical system; and

comparing the simulated aerial image to the reference aerial image in order to determine the magnitude of the aberration.

In some embodiments, comparing the simulated aerial image to the reference aerial image includes defining an error function responsively to a difference between the simulated aerial image to the reference aerial image, and varying the parameter so as to minimize the error function. Typically, generating the simulated aerial image includes representing a point spread function (PSF) of the optical system as a series of aberration terms having respective coefficients, and varying the parameter includes adjusting the respective coefficients of the aberration terms. In one embodiment, representing the PSF includes expressing the PSF as a Taylor series in terms of the coefficients of the aberration terms.

In another embodiment, generating the simulated aerial image includes representing a point spread function (PSF) of the optical system in terms of a defocus of the optical system, and varying the parameter includes determining a magnitude of the defocus. Typically, representing the PSF includes expressing the PSF as a product of a first series of first terms representing the defocus by a second series of second terms representing the aberration.

In a disclosed embodiment, generating the simulated aerial image includes determining a coherence characteristic of the optical system, and creating the mathematical model as a coherent decomposition of the optical system based on the coherence characteristic, the decomposition including a series of expansion functions having angular and radial components that are expressed as explicit functions. Additionally or alternatively, generating the simulated aerial image includes defining a kernel function representing characteristics of the optical system, an creating the mathematical model as a coherent decomposition of the optical system into a series of expansion functions, which are components of eigenfunctions of the kernel function, without computing eigenvalues of the kernel function.

There is moreover provided, in accordance with an embodiment of the present invention, apparatus for generating a simulated aerial image of a mask projected by an optical system, including:

a memory, which is arranged to store a transmission function of a mask; and

an image processor, which is adapted to compute a coherent decomposition of the optical system based on a coherence characteristic of the optical system, the decomposition including a series of expansion functions having angular and radial components that are expressed as explicit functions, to read the transmission function from the memory, and to convolve the expansion functions with the transmission function in order to generate the simulated aerial image.

There is furthermore provided, in accordance with an embodiment of the present invention, apparatus for generating a simulated aerial image of a mask projected by an optical system, including:

a memory, which is arranged to store a transmission function of a mask; and

an image processor, which is adapted to receive a definition of a kernel function representing characteristics of the optical system, and to compute a coherent decomposition of the optical system into a series of expansion functions, which are components of eigenfunctions of the kernel function, without computing eigenvalues of the kernel function, and to read the transmission from the memory and convolve the expansion functions with the transmission function in order to generate the simulated aerial image.

There is also provided, in accordance with an embodiment of the present invention, apparatus for inspecting a mask, including:

an optical system, which is adapted to project radiation through the mask so as to form an actual aerial image; and

an image processor, which is coupled to receive the actual aerial image and is adapted, responsively to a transmission function derived from a design of the mask, to generate a simulated aerial image by convolving a mathematical model of the optical system with the transmission function, and to compare the simulated aerial image to the actual aerial image in order to identify a defect in the mask.

There is additionally provided, in accordance with an embodiment of the present invention, apparatus for determining characteristics of an optical system, including:

an imaging device, which is arranged capture a reference aerial image formed by projecting radiation through a mask using the optical system; and

an image processor, which is adapted to generate a simulated aerial image by convolving a transmission function of the mask with a mathematical model of the optical system, the model including a variable parameter representative of a magnitude of an aberration of the optical system, and to compare the simulated aerial image to the reference aerial image in order to determine the magnitude of the aberration.

There is further provided, in accordance with an embodiment of the present invention, a computer software product for generating a simulated aerial image of a mask projected by an optical system, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to compute a coherent decomposition of the optical system based on a coherence characteristic of the optical system, the decomposition including a series of expansion functions having angular and radial components that are expressed as explicit functions, and to convolve the expansion functions with a transmission function of the mask in order to generate the simulated aerial image.

There is moreover provided, in accordance with an embodiment of the present invention, a computer software product for generating a simulated aerial image of a mask projected by an optical system, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to compute a coherent decomposition of the optical system into a series of expansion functions, which are components of eigenfunctions of a kernel function representing characteristics of the optical system, without computing eigenvalues of the kernel function, and to convolve the expansion functions with a transmission function of the mask in order to generate the simulated aerial image.

There is furthermore provided, in accordance with an embodiment of the present invention, a computer software product for use in inspection of a mask, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to receive an actual aerial image of the mask projected by an optical system, and responsively to a transmission function derived from a design of the mask, to generate a simulated aerial image by convolving a mathematical model of the optical system with the transmission function, and to compare the simulated aerial image to the actual aerial image in order to identify a defect in the mask.

There is also provided, in accordance with an embodiment of the present invention, a computer software product for determining characteristics of an optical system, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to receive a reference aerial image formed by projecting radiation through a mask using the optical system, and to generate a simulated aerial image by convolving a transmission function of the mask with a mathematical model of the optical system, the model including a variable parameter representative of a magnitude of an aberration of the optical system, and to compare the simulated aerial image to the reference aerial image in order to determine the magnitude of the aberration.

The present invention will be more fully understood from the following detailed description of the embodiments thereof, taken together with the drawings in which:

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic side view of a system for mask projection and inspection, in accordance with an embodiment of the present invention;

FIG. 2 is a flow chart that schematically illustrates a method for aerial image simulation, in accordance with an embodiment of the present invention;

FIG. 3 is a plot showing eigenvalues of an OCD expansion, in accordance with an embodiment of the present invention;

FIGS. 4A-4D are plots that schematically illustrate eigenfunctions found in an OCD expansion, in accordance with an embodiment of the present invention;

FIG. 5 is a flow chart that schematically illustrates a method for computing expansion functions for use in simulating an aerial image, in accordance with an embodiment of the present invention;

FIGS. 6A and 6B are plots of simulated aerial image intensity, in accordance with an embodiment of the present invention; and

FIG. 7 is a flow chart that schematically illustrates a method for measuring aberrations of an optical system, in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION OF EMBODIMENTS

FIG. 1 is a schematic side view of a system 20 for projection of a mask 22 onto a target plane 24, in accordance with an embodiment of the present invention. Typically, mask 22 embodies a predetermined design for a thin film layer that is to be formed by photolithography on a substrate at plane 24, as is known in the art. The design is characterized by a complex transmission function T({right arrow over (x)}). Alternatively, system 20 may be used in projection of patterns of other types, and the term “mask” should be understood to comprise substantially any sort of object carrying a pattern that can be projected in this manner onto a target plane. Furthermore, the principles of the present invention may also be applied in projection systems that are based on reflection of radiation from mask 22, as well as other types of systems that use optical projection and/or microscopy.

An illumination source 26 emits radiation, which typically comprises visible, ultraviolet or infrared radiation. A condenser lens 28, having an aperture 30, focuses the light from source 26 through mask 22. A projection lens 32, having an aperture 34, focuses an aerial image of mask 22 onto plane 24. Typically, lenses 28 and 32 comprise complex, multi-element lenses. The respective apertures 30 and 34 and respective distances of lenses 28 and 32 from mask 22 define respective numerical apertures, NA_(illumination) and NA_(aperture), as is known in the art. The coherence ratio of system 20 is then given by:

$\begin{matrix} {\sigma = \frac{{NA}_{illumination}}{{NA}_{projection}}} & (2) \end{matrix}$

More generally, the term NA_(aperture) may be replaced by the numerical aperture of the collection optics, NA_(collection). Typically, the higher the coherence of the illumination, the lower will be the value of σ.

System 20 may be used, as well, for inspection of mask 22, typically for purposes of detecting mask defects. In this case, an electronic imaging device, such as a video camera 36, captures the actual aerial image formed at plane 24 with high resolution. An image processor 37 generates a simulated aerial image, based on the known design of mask 22, which is stored in a design database 38. The mask design determines the transmission function T({right arrow over (x)}). The simulated aerial image is computed, using novel techniques that are described hereinbelow, based on T({right arrow over (x)}) and on the optical characteristics of system 20. Image processor 37 compares the actual aerial image to the simulated aerial image, in order to detect discrepancies between the two, which may be indicative of defects in mask 22. Generally speaking, the image processor marks the locations of these defects for further inspection, cleaning or repair, as appropriate.

Image processor 37 typically comprises a general-purpose computer, which performs the functions described in the present patent application under the control of suitable software. The software may be downloaded to the computer in electronic form, over a network, for example, or it may alternatively be provided on tangible media, such as optical, magnetic or non-volatile electronic storage media. Alternatively or additionally, at least some of the functions of the image processor may be performed by dedicated or programmable hardware components, such as a digital signal processor.

FIG. 2 is a flow chart that schematically illustrates a method for aerial image simulation, in accordance with an embodiment of the present invention. The simulated aerial image that is determined in this fashion may be used for finding defects in mask 22, as described above. Alternatively, the simulated aerial image may be used for other purposes, such as testing the expected performance of mask 22 during the design stage to verify that the mask will create the precise, desired features by photolithography on a semiconductor wafer. Further alternatively or additionally, the simulated aerial image may be used in measuring optical aberrations in system 20, as described hereinbelow with reference to FIG. 7. Other applications of these simulated aerial images will be apparent to those skilled in the art.

As a preliminary step, processor 37 reads data defining the design of mask 22 from database 38, at a mask transmission input step 40. The data may include the actual transmission function T({right arrow over (x)}) of the mask. Alternatively, the processor may compute the transmission function in a straightforward manner based on the layout of opaque and phase-shifting elements in the mask design. Processor 37 also reads in or determines calibration data regarding projection system 20, at a system parameter input step 42. These data enable the processor to calculate the coherence ratio of the system, as given by equation. In addition, to the extent that known aberrations in projection lens 32 are to be taken into account, as described hereinbelow with reference to FIG. 5, the calibration data may include a model of the aberrations.

Based on the optical characteristics of system 20, processor 37 computes a set of expansion functions for use in modeling aerial images produced by the system, at an expansion step 44. The derivation of these expansion functions is described in detail in Appendix A. Briefly, in an ideal, cylindrically-symmetrical system, without aberrations, the functions and coordinates appearing in the Hopkins formula (equation (1)) may be expressed in polar coordinates as follows:

$\begin{matrix} {{{{A\left( \overset{\rightarrow}{\eta} \right)} = \frac{J_{1}\left( \rho_{1} \right)}{\rho_{1}}};{{A\left( \overset{\rightarrow}{\iota} \right)} = \frac{J_{1}\left( \rho_{2} \right)}{\rho_{2}}};}{{{J\left( {{\overset{\rightarrow}{\eta} - \overset{\rightarrow}{\iota}}} \right)} = \frac{J_{1}\left( {\sigma \sqrt{\rho_{1} + \rho_{2} - {2\rho_{1}\rho_{2}\cos \; (\theta)}}} \right)}{\sigma \sqrt{\rho_{1} + \rho_{2} - {2\rho_{1}\rho_{2}\cos \; (\theta)}}}};}{{\eta_{x} = {\rho_{1}{\cos \left( \theta_{1} \right)}}};}{{\eta_{y} = {\rho_{1}{\sin \left( \theta_{1} \right)}}};}{{\iota_{x} = {\rho_{2}{\cos \left( \theta_{2} \right)}}};}{{\iota_{y} = {\rho_{2}{\sin \left( \theta_{2} \right)}}};}{{\theta = {\theta_{1} - \theta_{2}}};}} & (3) \end{matrix}$

In these equations, J₁ is the first-order Bessel function, and ρ is a normalized radial coordinate given by:

$\begin{matrix} {\rho = {\frac{2\pi}{\lambda}{{NA}_{projection} \cdot \sqrt{\left( {x^{2} + y^{2}} \right)}}}} & (4) \end{matrix}$

The transmission function T may similarly be converted to polar coordinates.

The part of the Hopkins formula that is dependent on the optical system can be expressed in terms of an optimal coherent decomposition (OCD) of a kernel K into a truncated series of eigenfunctions φ_(i):

$\begin{matrix} \begin{matrix} {{K\left( {\rho_{1},\rho_{2},\theta_{1},\theta_{2}} \right)} = {{A\left( \overset{\rightarrow}{\eta} \right)} \cdot {A^{*}\left( \overset{\rightarrow}{\iota} \right)} \cdot {J\left( {{\overset{\rightarrow}{\eta} - \overset{\rightarrow}{\iota}}} \right)}}} \\ {= {\sum\limits_{i = 1}^{n}\; {\lambda_{i} \cdot {\varphi_{i}\left( {\rho_{1},\theta_{1}} \right)} \cdot {\varphi_{i}\left( {\rho_{2},\theta_{2}} \right)}}}} \end{matrix} & (5) \end{matrix}$

The eigenvalues λ_(i) and eigenfunctions φ_(i) in equation (5) are solutions of the inhomogeneous Fredholm equation:

$\begin{matrix} {{\int_{0}^{\infty}{\int_{0}^{2\pi}{{K\left( {\rho_{1},\rho_{2},\theta_{1},\theta_{2}} \right)}\rho_{2}{\varphi \left( {\rho_{2},\theta_{2}} \right)}{\rho_{2}}\ {\theta_{2}}}}} = {{\lambda\varphi}\left( {\rho_{1},\theta_{1}} \right)}} & (6) \end{matrix}$

As shown in Appendix A, equation (6) can be solved explicitly, using the coherence ratio σ and equations (3), to give the following set of eigenfunctions:

$\begin{matrix} {{{{{}_{}^{}{}_{}^{}}\left( {\rho,\theta} \right)} = {\sqrt{2}{\sum\limits_{k = m}^{M}\; {B_{k}^{m,i}\sqrt{\left( {1 + k} \right)}{\frac{J_{1}(\rho)}{\rho} \cdot \frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}}\; {\cos \left( {m\; \theta} \right)}}}}}{{{{}_{}^{}{}_{}^{}}\left( {\rho,\theta} \right)} = {\sqrt{2}{\sum\limits_{k = m}^{M}\; {B_{k}^{m,i}\sqrt{\left( {1 + k} \right)}{\frac{J_{1}(\rho)}{\rho} \cdot \frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}}\; {\sin \left( {m\; \theta} \right)}}}}}} & (7) \end{matrix}$

Here J_(n)(ρ) is the nth-order Bessel function, m≧0 is an angular index, and the terms B_(k) ^(m,i) are eigenvectors, whose derivation is described in detail below in Appendix A. The upper bound M is chosen heuristically so that the residual error in computation of the eigenfunctions is within a desired limit.

Reference is now made to FIGS. 3 and 4A-D, which illustrate an exemplary solution to equation (6) for σ=0.5, in accordance with an embodiment of the present invention. FIG. 3 shows the set of eigenvalues λ_(i), arranged as a function of the angular index m. FIGS. 4A-D show the spatial Fourier transforms of the first four eigenfunctions, i.e., the eigenfunctions that correspond to the four greatest eigenvalues, as given by equation (7). The spatial frequency ν in FIGS. 4A-D is normalized in terms of the cutoff frequency

$\nu_{cutoff} = {{\frac{{NA}_{projection}}{\lambda}\mspace{14mu}\left\lbrack \frac{1p}{m\; m} \right\rbrack}.}$

It can be shown that the frequency spectra of the eigenfunctions contain no components whose frequency is greater than ν_(max)=(1+σ)ν_(cutoff).

It will be observed that the eigenvalues shown in FIG. 3 drop rapidly with increasing order. The error incurred by truncating the OCD series of equation (5) after n terms is bounded by

$ɛ = {\sum\limits_{i = {n + 1}}^{\infty}\; {\lambda_{i}^{2}.}}$

The value of n can thus be chosen based on the desired error bound in the simulated aerial image. The inventor has found that for low values of σ (about 0.3 or less), 5-10 terms are sufficient to give an error bound of 1/256, while higher values of σ require more terms, typically as many as 50 terms for σ=0.8, for example. Furthermore, it can be shown that if m₁>m₂, then the first eigenvalue corresponding to angular index m₁ will always be greater than the first eigenvalue corresponding to angular index m₂. Therefore, it is sufficient to compute the eigenfunctions of equation (7) out to angular index n in order to be assured of including at least the first n eigenfunctions (in order of magnitude of the corresponding eigenvalues).

This result is used in determining the number of expansion functions that must be computed at step 44. The expansion functions Θ_(m,k) are defined as follows:

$\begin{matrix} {{{{{}_{}^{}{}_{}^{}}\left( {\rho,\theta} \right)} = {\sum\limits_{k = m}^{M}\; {B_{k}^{m,i}{\Theta_{m,k}^{c}\left( {\rho,\theta} \right)}{\cos \left( {m\; \theta} \right)}}}}{{{{}_{}^{}{}_{}^{}}\left( {\rho,\theta} \right)} = {\sum\limits_{k = m}^{M}\; {B_{k}^{m,i}{\Theta_{m,k}^{s}\left( {\rho,\theta} \right)}{\sin \left( {m\; \theta} \right)}}}}} & (8) \end{matrix}$

So that:

$\begin{matrix} {{{\Theta_{m,k}^{c}\left( {\rho,\theta} \right)} \equiv {\sqrt{\left( {1 + k} \right)}{\frac{J_{1}(\rho)}{\rho} \cdot \frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}}\cos \; \left( {m\; \theta} \right)}}{{\Theta_{m,k}^{s}\left( {\rho,\theta} \right)} \equiv {\sqrt{\left( {1 + k} \right)}{\frac{J_{1}(\rho)}{\rho} \cdot \frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}}\sin \; \left( {m\; \theta} \right)}}} & (9) \end{matrix}$

In other words, the expansion functions are determined independently of the eigenvectors of equation (5). Alternatively, the eigenfunctions themselves, as given by equation (7), for example, may be used as the expansion functions.

Returning to equations (1) and (5), the aerial image intensity at each point in target plane 24 is given by:

$\begin{matrix} \begin{matrix} {{I\left( \overset{\rightarrow}{x} \right)} = {T \otimes K}} \\ {= {\sum\limits_{m = 0}^{M}\; {\sum\limits_{i = 1}^{n}\; {\lambda_{i}\left\{ {{{T \otimes {{}_{}^{}{}_{}^{}}}}^{2} + {{T \otimes {{}_{}^{}{}_{}^{}}}}^{2}} \right\}}}}} \end{matrix} & (10) \end{matrix}$

wherein the symbol {circle around (×)} represents convolution. In order to carry out this computation without having to compute the actual eigenvalues, processor 37 convolves the transmission function of mask 22 with the expansion functions of the optical system, at a convolution step 46, giving the convolution components:

ψ_(m,i) ^(C)(ρ,θ)≡Θ_(m,k) ^(c)(ρ,θ)

T(ρ,θ)

ψ_(m,i) ^(S)(ρ,θ)≡Θ_(m,k) ^(s)(ρ,θ)

T(ρ,θ)  (11)

wherein i=k−m. Due to the orthogonality of the eigenvectors, substitution of the expansion functions into equation (10) in place of the eigenfunctions permits the equation to be rewritten in terms of the convolution components:

$\begin{matrix} {{I\left( \overset{\rightarrow}{x} \right)} = {{\sum\limits_{m = 0}^{M}{\sum\limits_{i = 1}^{n}{\psi_{m,i}^{C}}^{2}}} + {\psi_{m,i}^{S}}^{2}}} & (12) \end{matrix}$

Thus, the aerial image intensity may be computed explicitly, to arbitrary precision, based on the expansion functions Θ_(m,k) and the known mask transmission function T.

FIG. 5 is a flow chart that schematically illustrates a method for computing the expansion functions (at step 44 in FIG. 2) to take into account aberrations in system 20, in accordance with an embodiment of the present invention. In the derivation above, ideal optics were assumed in equations (3), so that the point spread function (PSF) of the projection system could be expressed as A(ρ,

${\left. \theta \right) = \frac{J_{1}(\rho)}{\rho}},$

leading to the form of the eigenfunctions that is given above in equations (7). Alternatively, however, the full polar form of the PSF may be carried through the derivation, leading to the more general form of the eigenfunctions:

$\begin{matrix} {{{{{}_{}^{}{}_{}^{}}\left( {\rho,\theta} \right)} = {\sqrt{2}{\sum\limits_{k = 1}^{M}{B_{k}^{m,i}{\sqrt{\left( {1 + k} \right)} \cdot {A\left( {\rho,\theta} \right)}}\frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}{\cos \left( {\left( {k - {2m}} \right)\theta} \right)}}}}}{{{{}_{}^{}{}_{}^{}}\left( {\rho,\theta} \right)} = {\sqrt{2}{\sum\limits_{k = 1}^{M}{B_{k}^{m,i}{\sqrt{\left( {1 + k} \right)} \cdot {A\left( {\rho,\theta} \right)}}\frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}{\sin \left( {\left( {k - {2m}} \right)\theta} \right)}}}}}} & \left\lbrack {{equation}\mspace{14mu} (13)} \right\rbrack \end{matrix}$

Thus, in the embodiment shown in FIG. 5, in order to compute the expansion functions, processor 37 receives a definition of the optical aberrations of system 20, at an aberration definition step 50. Based on these aberrations, the processor calculates the PSF of the system, A(ρ, θ), at a PSF determination step 52. For small aberrations, the PSF may be expressed as a combination of Bessel functions representing the aberrations that are present in the system:

$\begin{matrix} {{A_{astigmatic} \approx {\frac{2{J_{1}(\rho)}}{\rho} - {\; \alpha_{astig}\frac{2{J_{3}(\rho)}}{\rho}{\cos \left( {2\; \theta} \right)}}}}{A_{coma} \approx {\frac{2\; {J_{1}(\rho)}}{\rho} - {\alpha_{coma}\frac{2{J_{4}(\rho)}}{\rho}\cos \; \theta}}}{A_{spherical} \approx {\frac{2{J_{1}(\rho)}}{\rho} + {\; \alpha_{spherical}\frac{2{J_{5}(\rho)}}{\rho}}}}} & (14) \end{matrix}$

In each of these equations, α represents the magnitude of the corresponding aberration. Based on these equations, it is possible to use expressions for the eigenfunctions φ or expansion functions Θ in calculating the simulated aerial image.

Alternatively, the PSF of system 20 may be expressed as the truncated Fourier transform (limited to the radial range from 0 to 1) of the aperture function:

$\begin{matrix} {{A\left( {\rho,\theta} \right)} = {\frac{}{\lambda}{E \cdot {NA}^{2}}^{{ \cdot {NA}^{2}}u}{\int_{0}^{1}{\int_{0}^{2\pi}{\eta \cdot {\eta} \cdot {\phi} \cdot ^{{\lbrack{{{- \eta}\; \rho \; {\cos {({\theta - \phi})}}} + {k\; {\Phi {({Y^{*},\phi,\eta})}}} - {\frac{1}{2}u\; \eta^{2}}}\rbrack}}}}}}} & \left\lbrack {{equation}\mspace{14mu} (15)} \right\rbrack \end{matrix}$

In this equation:

-   -   E is the field amplitude.     -   k is the wave number, i.e.

$k = {\frac{2\pi}{\lambda}.}$

-   -   u is the normalized defocus, i.e. u=k·NA²·z, wherein z is the         actual defocus distance.     -   Φ is the phase aberration function.     -   Y* is the coordinate of the object point

Equation (15) may be integrated, using a suitable representation for Φ, in order to determine A(ρ, θ) completely.

For sufficiently small enough object area, it can generally be assumed that Φ does not depend on Y*. In this case, Φ(ρ, θ) can be expanded in a series of Zernike polynomials R_(n) ^(m), as described, for example, by Born & Wolf in Principles of Optics, 4^(th) edition (Pergamon Press, 1970), in section 9.2, pages 464-467, which is incorporated herein by reference. Zernike representations of common aberrations include:

TABLE I ZERNIKE POLYNOMIALS Spherical aberration ${\frac{1}{\sqrt{2}}A_{040}{R_{4}^{0}(\rho)}} = {\frac{1}{\sqrt{2}}{A_{040}\left( {{6\rho^{4}} - {6\rho^{2}} + 1} \right)}}$ Coma ${A_{031}R\frac{1}{3}(\rho){\cos (\theta)}} = {{A_{031}\left( {{3\rho^{3}} - {2\rho}} \right)}{\cos (\theta)}}$ Astigmatism A₀₂₂R₂²(ρ)cos (2θ) = A₀₂₂ρ²cos (2θ)

The magnitudes of the aberrations are represented by the corresponding Zernlike polynomial coefficients A_(lnm). Standard software packages for optical design are capable of computing these coefficients for substantially any design that they generate.

The actual PSF determined at step 52, including aberrations, is then used in computing the expansion functions Θ_(m,k) that are applicable to the actual system 20, at an expansion step 54:

$\begin{matrix} {{{\Theta_{m,k}^{c}\left( {\rho,\theta} \right)} \equiv {\sqrt{\left( {1 + k} \right)}{{A\left( {\rho,\theta} \right)} \cdot \frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}}{\cos \left\lbrack {\left( {k - {2m}} \right)\theta} \right\rbrack}}}{{\Theta_{m,k}^{s}\left( {\rho,\theta} \right)} \equiv {\sqrt{\left( {1 + k} \right)}{{A\left( {\rho,\theta} \right)} \cdot \frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}}{\sin \left\lbrack {\left( {k - {2m}} \right)\theta} \right\rbrack}}}} & (16) \end{matrix}$

These expansion functions are a generalized form of the expansion functions for a perfect optical system shown in equation (9). They are convolved with the transmission function T(ρ, θ) at step 46, in order to determine the aerial image intensity I, as given by equations (11) and (12).

FIGS. 6A and 6B are plots that schematically illustrate simulated aerial image intensity calculated using the method described above, in accordance with an embodiment of the present invention. FIG. 6B is an enlarged detail of a small portion of FIG. 6A. The plots show the variation of intensity along a cross-section through a pattern of 0.5 μm lines with a pitch of 1 μm. In this example, λ=193 nm, NA=0.175 and the defocus (z) is 4 μm. The plots compare the “fast OCD” simulated image intensity, calculated using equations (11), (12) and (16), to a full “baseline” calculation computed by numerical integration of equation (1). The difference in the results of the two computations are insignificant, even though the defocus assumed in this case is substantially larger than what is normally encountered in practical photolithography and mask inspection systems. Similar results are obtained in simulation of other aberrations, such as astigmatism and coma.

FIG. 7 is a flow chart that schematically illustrates a method for determining the aberrations in an optical system, such as system 20, in accordance with an embodiment of the present invention. The method uses the expansion functions given by equation (16) in a converse manner to that described above: Whereas in the method of FIG. 5, the expansion functions are determined based on known aberrations of the optical system, in the method of FIG. 7, the expansion functions are used, together with an actual aerial image, to estimate aberrations that are not exactly known. This method may thus be used, for example, to calibrate an optical system used in mask inspection or lithography.

To estimate the aberrations in system 20, a reference aerial image I_(reference) is acquired by camera 36, at an image acquisition step 60. This image is acquired using mask 22 with a known reference pattern, giving a transmission function T_(reference). An aberration vector is defined, based on the magnitudes of the aberrations that are expected to be present in system 20, for example:

α=(u_(defocus),α_(coma),α_(spherical))  (17)

Although for the sake of simplicity, this vector includes only defocus, coma and spherical aberration components, ā may be expanded to include other aberrations, as well. The PSF of system 20 may then be expressed as A(ρ, θ, α), which can be expanded in terms of the aberrations, as shown above in equation (14), for example. Other, more precise parametric representations of the PSF are described hereinbelow. The expanded form of the PSF is inserted into equation (16), so that the expansion functions Θ_(m,k) explicitly contain the magnitudes of the aberration components as variable parameters. The expansion functions are then inserted into equations (11) and (12) in order to compute a simulated aerial image I_(simulated)({right arrow over (x)}, α), at a simulation step 62.

The simulated aerial image, like the expansion functions on which it is based, contains the components of α as parameters. In order to determine the values of these parameters in system 20, an error function is defined, based on the actual, reference image captured at step 60:

$\begin{matrix} {{ɛ\left( \overset{\_}{\alpha} \right)} = {{\underset{S}{\int\int}\left\lbrack {{I_{measured}\left( \overset{\rightharpoonup}{x} \right)} - {I_{simulated}\left( {\overset{\rightharpoonup}{x},\overset{\_}{\alpha}} \right)}} \right\rbrack}^{2}{\overset{\rightharpoonup}{x}}}} & (18) \end{matrix}$

Here the integral is taken over target plane 24. Methods of optimization known in the art may then be used to find the vector α that minimizes the error function ε, at an optimization step 64. The components of ā (u_(defocus), α_(coma), α_(spherical)) that are found in this manner give the magnitudes of the actual aberrations in system 20.

Various methods may be used to expand the PSF A(ρ, θ, α) in terms of the aberration vector components. One method is a Taylor series:

$\begin{matrix} {{A\left( {\rho,\theta,\overset{\_}{\alpha}} \right)} = {{A\left( {\rho,\theta} \right)}_{\overset{\_}{\alpha} = 0}{{+ {\nabla_{\alpha}A}}_{\overset{\_}{\alpha} = 0}{{{{\cdot \Delta}\; \overset{\_}{\alpha}} + {\frac{1}{2}{\nabla_{\alpha}{\nabla_{\alpha}A}}}}_{\overset{\_}{\alpha} = 0}{{{\cdot \Delta}\; {\overset{\_}{\alpha}}^{2}} + \ldots}}}}} & (19) \end{matrix}$

Here the gradient ∇_(α) is made up of the partial derivatives of A with respect to each of the components of α, and the terms of ∇_(α)∇_(α)A include cross derivatives, as well. Aberration-based convolution components (in place of the generalized convolution components of equation (11)), may then be defined as follows:

$\begin{matrix} {{{{\psi_{m,n}^{C,0}\left( {\rho,\theta,\sigma} \right)} \equiv {{\sqrt{1 + m} \cdot {T\left( {\rho,\theta} \right)}}\left\{ {{\frac{J_{1}(\rho)}{\rho} \cdot \frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}}{\cos \left\lbrack {\left( {k - {2m}} \right)\theta} \right\rbrack}} \right\}}}{{\psi_{m,n}^{C,\overset{\_}{\alpha}}\left( {\rho,\theta,\sigma} \right)} \equiv {{\sqrt{1 + m} \cdot {T\left( {\rho,\theta} \right)}}\left\{ {{\nabla_{\alpha}A}_{\overset{\_}{\alpha} = 0}{{\cdot \frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}}{\cos \left\lbrack {\left( {k - {2m}} \right)\theta} \right\rbrack}}} \right\}}}{\psi_{m,n}^{C,{\overset{\_}{\alpha}\overset{\_}{\alpha}}}\left( {\rho,\theta,\sigma} \right)} \equiv {{\sqrt{1 + m} \cdot {T\left( {\rho,\theta} \right)}}\left\{ {{\nabla_{\alpha}{\nabla_{\alpha}A}}_{\overset{\_}{\alpha} = 0}{{\cdot \frac{J_{1 + k}\left( {\sigma \; \rho} \right)}{\sigma \; \rho}}{\cos \left\lbrack {\left( {k - {2m}} \right)\theta} \right\rbrack}}} \right\}}};\ldots} & \left\lbrack {{equation}\mspace{14mu} (20)} \right\rbrack \end{matrix}$

This equation defines the cosine convolution components. The sine convolution components ψ_(m,n) ^(S,0), ψ_(m,n) ^(S, α) (ρ, θ, σ), ψ_(m,n) ^(S, α α) (ρ, θ, σ), . . . are similarly defined. Note that for a given transmission function T, the convolution components may be precomputed offline.

The simulated aerial image may then be represented in terms of the convolution components, as follows:

$\begin{matrix} {{I_{simulated}\left( {\overset{\rightarrow}{x},\overset{\_}{\alpha}} \right)} = {{\sum\limits_{m}{\sum\limits_{n}{{\psi_{m,n}^{C,0} + {\sum\limits_{\overset{\_}{\alpha}\;}{\overset{\_}{\alpha} \cdot \psi_{m,n}^{C,\overset{\_}{\alpha}}}} + {\frac{1}{2}{\sum\limits_{\overset{\_}{\alpha},\overset{\_}{\alpha}}{\overset{\_}{\alpha}\mspace{11mu} {\overset{\_}{\alpha} \cdot \psi_{m,n}^{C,{\overset{\_}{\alpha}\mspace{11mu} \overset{\_}{\alpha}}}}}}} +}}^{2}}} + {{\sin \mspace{14mu} {terms}}}^{2}}} & \left\lbrack {{equation}\mspace{14mu} (21)} \right\rbrack \end{matrix}$

This expression is inserted into equation (18), which is then minimized over the components of α.

In order to simplify the computation of the PSF and its derivatives, it is convenient to separate the defocus u out of the aberration vector α, and to express the aberrations in the PSF in terms of the Zernike polynomials R_(n) ^(m), as given above in Table I, for example:

$\begin{matrix} {{A\left( {\rho,\theta,u,\overset{\_}{\alpha}} \right)} = {\int_{0}^{2\pi}{\int_{0}^{1}{{\exp \left( {{{- }\frac{1}{2}u\; \xi^{2}} - {\; p\; \xi \; {\cos \left( {\theta - \phi} \right)}} + {\; k{\sum\limits_{n\; m}{\alpha_{n\; m}{R_{n}^{m}(\xi)}\cos \; m\; \phi}}}} \right)}{\phi}{\xi}}}}} & \left\lbrack {{equation}\mspace{14mu} (22)} \right\rbrack \end{matrix}$

Here the terms α_(nm) represent the relevant components of α, for example α₄₀ for spherical aberration and α₃₁ for coma. The defocus component in the PSF may conveniently be expressed in terms of the Bauer expansion:

$\begin{matrix} \begin{matrix} {^{{- }\; \frac{1}{2}u\; \xi^{2}} = {^{{- }\frac{u}{4}}\sqrt{\frac{2\pi}{u}}{\sum\limits_{s = 0}^{\infty}{\left( {- } \right)^{s}\left( {{2s} + 1} \right){J_{s + \frac{1}{2}}\left( \frac{u}{4} \right)}{R_{2s}^{0}(\xi)}}}}} \\ {= {^{{- }\frac{u}{4}}\begin{Bmatrix} {\left\lbrack {{2\; {\cos \left( \frac{u}{4} \right)}} + {\sqrt{\frac{u}{4}}{J_{\frac{3}{2}}\left( \frac{u}{4} \right)}}} \right\rbrack + {{- }\sqrt{\frac{\pi \; u}{2}}}} \\ {{\left\lbrack {{J_{\frac{1}{2}}\left( \frac{u}{4} \right)} + {J_{\frac{5}{2}}\left( \frac{u}{4} \right)}} \right\rbrack \left( {{2\xi^{2}} - 1} \right)} + \ldots} \end{Bmatrix}}} \end{matrix} & (23) \end{matrix}$

Inserting the terms of equation (23) into equation (22) allows the PSF to be expressed as the product of a series of defocus expansion terms D_(s) multiplied by a Taylor expansion of the focused PSF, based on the relevant Zernike polynomial coefficients α_(nm):

$\begin{matrix} {{A\left( {\rho,\theta,u,\overset{\_}{\alpha}} \right)} = {\sum\limits_{s}{{D_{s}(u)}.\left\{ {{A_{s}\left( {\rho,\theta} \right)}_{\overset{\_}{\alpha} = 0}{{+ {\sum\limits_{m,n}\frac{\partial A_{s}}{\partial\alpha_{n\; m}}}}_{\overset{\_}{\alpha} = 0}{{{{\cdot \Delta}\; \overset{\_}{\alpha}} + {\frac{1}{2}{\sum\limits_{m,n}{\sum\limits_{m^{\prime},n^{\prime}}\frac{\partial^{2}A_{s}}{{\partial\alpha_{n\; m}}{\partial\alpha_{n^{\prime}m^{\prime}}}}}}}}_{\overset{\_}{\alpha} = 0}{{{\cdot \Delta}\; {\overset{\_}{\alpha}}^{2}} + \ldots}}}} \right\}}}} & \left\lbrack {{equation}\mspace{14mu} (24)} \right\rbrack \end{matrix}$

The constant prefactor in equation (15) is omitted here for simplicity. The terms of this series are determined analytically by differentiating equation (22) and integrating over φ. The results for the leading terms of the Taylor expansion in equation (24) with s=0 and 1 are given in Table II below:

TABLE II DEFOCUS/ABERRATION EXPANSION TERMS D₀ $\left\lbrack {{2\; {\cos \left( \frac{u}{4} \right)}} + {\sqrt{\frac{u}{4}}{J_{\frac{3}{2}}\left( \frac{u}{4} \right)}}} \right\rbrack e^{{- i}\frac{u}{4}}$ A₀ 2∫₀¹J₀(ξp) dξ $\frac{\partial A_{0}}{\partial\alpha_{coma}}$ 2cos (θ)∫₀¹R₃¹(ξ) ⋅ J₁(ξρ)ξ dξ $\frac{\partial^{2}A_{0}}{\partial\alpha_{coma}^{2}}$ ${- \frac{1}{2}}{\int_{0}^{1}{\xi \ {d\xi}\left\{ {R_{3}^{1}(\xi)} \right\}^{2}\left\{ {{J_{0}({\rho\xi})} - {{\cos \left( {2\theta} \right)}{J_{2}({\rho\xi})}}} \right\}}}$ $\frac{\partial A_{0}}{\partial\alpha_{spherical}}$ 2i∫₀¹R₄⁰(ξ)J₀(ξρ)ξ dξ $\frac{\partial^{2}A_{0}}{\partial\alpha_{spherical}^{2}}$ −∫₀¹ξ dξ{R₄⁰(ξ)}²J₀(ρξ) D_(I) ${- i}{\sqrt{\frac{\pi u}{2}}\left\lbrack {{J_{\frac{1}{2}}\left( \frac{u}{4} \right)} + {J_{\frac{5}{2}}\left( \frac{u}{4} \right)}} \right\rbrack}e^{{- i}\frac{u}{4}}$ A₁ 2∫₀¹R₂⁰(ξ)J₀(ξρ)ξ dξ $\frac{\partial A_{1}}{\partial\alpha_{coma}}$ 2cos (θ)∫₀¹R₃¹(ξ)R₂⁰(ξ)J₁(ξρ)ξ dξ $\frac{\partial A_{1}}{\partial\alpha_{spherical}}$ 2i∫₀¹R₄⁰(ξ)R₂⁰(ξ)J₀(ξρ)ξ dξ

The representation of the PSF that is given by equation (24) and Table II is used in computing the cosine and sine convolution components Ψ^(C) and Ψ^(S), as shown above in equation (20). These components are then used in finding I_(simulated), as given by equation (21). Minimization of ε (equation (18)) over u, α_(coma) and α_(spherical) has been found to give a good approximation of the actual aberrations in real optical systems.

It will be appreciated that the embodiments described above are cited by way of example, and that the present invention is not limited to what has been particularly shown and described hereinabove. Rather, the scope of the present invention includes both combinations and subcombinations of the various features described hereinabove, as well as variations and modifications thereof which would occur to persons skilled in the art upon reading the foregoing description and which are not disclosed in the prior art.

APPENDIX A Derivation and Properties of the Kernel Eigenfunctions

For the case of a cylindrically-symmetrical kernel with no aberrations, as given by equation (5), K has the following form:

$\begin{matrix} {{K\left( {\rho_{1},\rho_{2},\theta_{1},\theta_{2}} \right)} = {\frac{J_{1}\left( \rho_{1} \right)}{\rho_{1}} \cdot \frac{J_{1}\left( \rho_{2} \right)}{\rho_{2}} \cdot \frac{J_{1}\left( {\sigma \sqrt{\rho_{1}^{2} + \rho_{2}^{2} - {2\rho_{1}\rho_{2}{\cos \left( {\theta_{1} - \theta_{2}} \right)}}}} \right)}{\sigma \sqrt{\rho_{1}^{2} + \rho_{2}^{2} - {2\rho_{1}\rho_{2}{\cos \left( {\theta_{1} - \theta_{2}} \right)}}}}}} & \left\lbrack {{equation}\mspace{14mu} (25)} \right\rbrack \end{matrix}$

The main obstacle to solving equation (6) is the third part of equation (25). In order to solve this problem, we use Gegenbauer's addition theorem, which implies that:

$\begin{matrix} {{{\varpi = \sqrt{\rho_{1}^{2} + \rho_{2}^{2} - {2\rho_{1}\rho_{2}\; {\cos \left( {\theta_{1} - \theta_{2}} \right)}}}};}{{\frac{J_{1}(\varpi)}{\varpi} = {2{\sum\limits_{k = 0}^{\infty}\; {\left( {1 + k} \right)\frac{J_{1 + k}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}}\frac{J_{1 + k}\left( {\sigma\rho}_{2} \right)}{{\sigma\rho}_{2}}{c_{k}^{1}\left( {\cos \left( {\theta_{1} - \theta_{2}} \right)} \right)}}}}};}{{c_{k}^{1}\left( {\cos \left( {\theta_{1} - \theta_{2}} \right)} \right)} = {\sum\limits_{m = 0}^{k}\; {\cos \left\lbrack {\left( {k - {2m}} \right)\left( {\theta_{1} - \theta_{2}} \right)} \right\rbrack}}}} & (26) \end{matrix}$

Using the symmetry of the optical system, the eigenfunctions φ can be factored as follows:

φ(ρ,θ)=φ(ρ)·e ^(imθ)  (27)

Substituting this representation into equation (6) gives:

$\begin{matrix} {{\int_{0}^{\infty}{\int_{0}^{2\pi}{{K\left( {\rho_{1},\rho_{2},{\theta_{1} - \theta_{2}}} \right)}\rho_{2}{{\phi \left( \rho_{2} \right)} \cdot ^{\; m\; \theta_{2}}}{\rho_{2}}{\theta_{2}}}}} = {\left. {{{\lambda\phi}\left( \rho_{1} \right)} \cdot ^{\; m\; \theta_{1}}}\Rightarrow{\int_{0}^{\infty}{\int_{0}^{2\pi}{{K\left( {\rho_{1},\rho_{2},{\theta_{1} - \theta_{2}}} \right)}\rho_{2}{{\phi \left( \rho_{2} \right)} \cdot ^{\; m\; {\theta {({\theta_{2} - \theta_{1}})}}}}\ {\rho_{2}}{\theta_{2}}}}} \right. = {{\lambda\phi}\left( \rho_{1} \right)}}} & (28) \end{matrix}$

To integrate out the Δθ=θ₁−θ₂ dependence of this equation, we note that the only part of equation (26) that is dependent on Δθ is C_(k) ¹. The result of the integration is:

$\begin{matrix} {{\int_{0}^{2\pi}{\sum\limits_{n = 0}^{k}\; {{\cos \left\lbrack {\left( {k - {2n}} \right)\Delta \; \theta} \right\rbrack}^{\; m\; \Delta \; \theta}{\Delta}\; \theta}}} = \left\{ \begin{matrix} {{{m} > k} = 0} \\ {{{\left( {{m} \leq k} \right)\&}\left( {m + {k\mspace{14mu} {even}}} \right)} = {2\pi}} \\ {{{\left( {{m} \leq k} \right)\&}\left( {m + {k\mspace{14mu} {odd}}} \right)} = 0} \end{matrix} \right.} & \left\lbrack {{equation}\mspace{20mu} (29)} \right\rbrack \end{matrix}$

Substituting this result into equation (25) gives the following reduced equation for φ_(n) ^(m)(ρ):

$\begin{matrix} {{\int_{0}^{\infty}{\rho_{2}{{\rho_{2}} \cdot \frac{J_{1}\left( \rho_{1} \right)}{\rho_{1}} \cdot \frac{J_{1}\left( \rho_{2} \right)}{\rho_{2}} \cdot 2}{\sum\limits_{k = m}^{\infty}\; {\left( {1 + k} \right)\frac{J_{1 + k}\left( {\sigma \; \rho_{1}} \right)}{\sigma \; \rho_{1}}\frac{J_{1 + k}\left( {\sigma \; \rho_{2}} \right)}{{\sigma\rho}_{2}}{\phi^{m}\left( \rho_{2} \right)}}}}} = {\lambda^{m}{\phi^{m}\left( \rho_{1} \right)}}} & \left\lbrack {{equation}\mspace{20mu} (30)} \right\rbrack \end{matrix}$

The primed sum “Σ′” in the above equation means that k increases from term to term in the series in steps of 2. This equation can be solved by multiplying both sides by

${\sqrt{\left( {1 + n} \right)}\frac{J_{1}\left( \rho_{1} \right)}{\rho_{1}}\frac{J_{1 + n}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}}\rho_{1}},$

and then integrating with respect to ρ₁:

$\begin{matrix} {{2{\sum\limits_{k = m}^{\infty}\; {\int_{0}^{\infty}{\rho_{1}{\rho_{1}}\sqrt{\left( {1 + n} \right)}\sqrt{\left( {1 + k} \right)}\frac{J_{1}\left( \rho_{1} \right)}{\rho_{1}}{\frac{J_{1 + n}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}} \cdot \frac{J_{1}\left( \rho_{1} \right)}{\rho_{1}} \cdot \frac{J_{1 + k}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}} \cdot {\int_{0}^{\infty}{\rho_{2}{\rho_{2}}\sqrt{\left( {1 + k} \right)}{\frac{J_{1}\left( \rho_{2} \right)}{\rho_{2}} \cdot \frac{J_{1 + k}\left( {\sigma\rho}_{2} \right)}{{\sigma\rho}_{2}}}{\phi^{m}\left( \rho_{2} \right)}}}}}}}} = {\lambda^{m}{\int_{0}^{\infty}{\rho_{1}{\rho_{1}}\sqrt{\left( {1 + n} \right)}\frac{J_{1}\left( \rho_{1} \right)}{\rho_{1}}\frac{J_{1 + n}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}}{\phi^{m}\left( \rho_{1} \right)}}}}} & \left\lbrack {{equation}\mspace{20mu} (31)} \right\rbrack \end{matrix}$

We now define:

$\begin{matrix} {{{A_{n,k}^{m} \equiv {\int_{0}^{\infty}{\rho_{1}{\rho_{1}}\sqrt{\left( {1 + n} \right)}\sqrt{\left( {1 + k} \right)}\frac{J_{1}\left( \rho_{1} \right)}{\rho_{1}}{\frac{J_{1 + n}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}} \cdot \frac{J_{1}\left( \rho_{1} \right)}{\rho_{1}} \cdot \frac{J_{1 + k}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}}}}}};}{B_{k}^{m} \equiv {\int_{0}^{\infty}{\rho_{2}{\rho_{2}}\sqrt{\left( {1 + k} \right)}{\frac{J_{1}\left( \rho_{2} \right)}{\rho_{2}} \cdot \frac{J_{1 + k}\left( {\sigma\rho}_{2} \right)}{{\sigma\rho}_{2}}}{\phi^{m}\left( \rho_{2} \right)}}}}} & \left\lbrack {{equation}\mspace{20mu} (32)} \right\rbrack \end{matrix}$

Equation (31) can then be rewritten simply as:

$\begin{matrix} {{\sum\limits_{k = m}^{\infty}\; {A_{n,k}^{m} \cdot B_{k}^{m}}} = {\lambda^{m,n} \cdot B_{n}^{m}}} & (33) \end{matrix}$

By truncating the sum in this equation, we get an algebraic eigenvalues problem, wherein the eigenfunctions of equation (30) can now be written:

$\begin{matrix} {{\varphi_{i}^{m}\left( {\rho,\theta} \right)} = {\sum\limits_{k = m}^{M}\; {{{}_{}^{}{}_{}^{m,i}}\sqrt{\left( {1 + k} \right)}{\frac{J_{1}(\rho)}{\rho} \cdot \frac{J_{1 + k}({\sigma\rho})}{\sigma\rho}}^{\; m\; \theta}}}} & (34) \end{matrix}$

Since the eigenfunctions φ_(i) ^(m) and φ_(i) ^(−m) are degenerate, we are free to choose any linear combination of them. For convenience, we choose the forms ^(C)φ_(i) ^(m)(ρ, θ) and ^(S)φ_(i) ^(m)(ρ, θ), as given in equation (7).

Because the matrices A_(k,n) ^(m)=A_(n,k) ^(m) are symmetrical, all their eigenvalues are real. The matrices A_(k,n) ^(m) for different values of m can all be derived from A_(k,n) ⁰ by deleting the first m rows and columns:

$\begin{matrix} {{{A^{0} = \begin{pmatrix} a_{11} & \cdots & a_{1M} \\ \vdots & ⋰ & \vdots \\ a_{M\; 1} & \cdots & a_{MM} \end{pmatrix}};}{{A^{1} = \begin{pmatrix} a_{22} & \cdots & a_{2M} \\ \vdots & ⋰ & \vdots \\ a_{M\; 2} & \cdots & a_{MM} \end{pmatrix}};}{{A^{2} = \begin{pmatrix} a_{33} & \cdots & a_{3M} \\ \vdots & ⋰ & \vdots \\ a_{M\; 3} & \cdots & a_{MM} \end{pmatrix}};{{etc}.}}} & (35) \end{matrix}$

According to the Courant-Fisher theorem, if λ₁>λ₂> . . . >λ_(N) are the eigenvalues of A⁰, and μ₁>μ₂> . . . >μ_(N-1) are the eigenvalues of A¹, then:

λ₁<μ₁<λ₂;

λ₂<μ₂<λ₃;

.

.

.

λ_(N-1)<λ_(N-1)<λ_(N);  (36)

This equation implies that the first eigenvalue will always correspond to m=0, and the second always to m=1. Thus, as noted above, in order to construct a kernel that will be certain to include the first n eigenfunctions (i.e., those corresponding to the n greatest eigenvalues), it is sufficient to compute the eigenfunctions (or functions Θ) up to m=n. The consequence of this theorem can be seen in FIG. 3.

The derivation of equation (12) above relied on the orthogonality of the eigenvectors B_(k) ^(m,i). To demonstrate this orthogonality, we note first that the cross-integral between two eigenfunctions of the kernel K can be calculated as follows:

$\begin{matrix} \begin{matrix} {{\langle{{{{}_{}^{}{}_{}^{}}\left( {\rho,\theta} \right)} \cdot {{{}_{}^{}{}_{}^{}}\left( {\rho,\theta} \right)}}\rangle} = {{\int_{0}^{\infty}{\rho {\rho}{\int_{0}^{2\pi}\ {{\theta} \cdot {{{}_{}^{}{}_{}^{}}\left( {\rho,\theta} \right)} \cdot {{{}_{}^{}{}_{}^{}}\left( {\rho,\theta} \right)}}}}} =}} \\ {= {\int_{0}^{\infty}{\rho {\rho}{\int_{0}^{2\pi}\ {{\theta}{\sum\limits_{k = m}^{M}\; {B_{k}^{m,i}\sqrt{\left( {1 + k} \right)}{\frac{J_{1}(\rho)}{\rho} \cdot}}}}}}}} \\ {{\frac{J_{1 + k}({\sigma\rho})}{\sigma\rho}{{\cos \left( {m\; \theta} \right)} \cdot}}} \\ {{\sum\limits_{k^{\prime} = 1}^{M}\; {B_{k^{\prime}}^{1,j}\sqrt{\left( {1 + k^{\prime}} \right)}{\frac{J_{1}(\rho)}{\rho} \cdot}}}} \\ {{\frac{J_{1 + k^{\prime}}({\sigma\rho})}{\sigma\rho}{\cos \left( {1\theta} \right)}}} \\ {= {2{{\pi\delta}\left( {1 - m} \right)}{\int_{0}^{\infty}{\rho {\rho}{\sum\limits_{k = m}^{M}\; \sum\limits_{k^{\prime} = 1}^{M}}}}}} \\ {{B_{k^{\prime}}^{1,j}B_{k}^{m,i}{\sqrt{\left( {1 + k} \right)\left( {1 + k^{\prime}} \right)} \cdot}}} \\ {{{\frac{J_{1}(\rho)}{\rho} \cdot \frac{J_{1 + k}({\sigma\rho})}{\sigma\rho}}{\frac{J_{1}(\rho)}{\rho} \cdot \frac{J_{1 + k^{\prime}}({\sigma\rho})}{\sigma\rho}}}} \\ {= {2{{\pi\delta}\left( {1 - m} \right)}{\sum\limits_{k = m}^{M}\; {\sum\limits_{k^{\prime} = 1}^{M}\; {B_{k^{\prime}}^{m,j}B_{k}^{m,i}{A_{k,k^{\prime}}^{1} \cdot}}}}}} \\ {= {2{{\pi\delta}\left( {1 - m} \right)}{\sum\limits_{k = m}^{M}\; {B_{k}^{m,j}{B_{k}^{m,i} \cdot \lambda_{j}^{m}}}}}} \end{matrix} & (37) \end{matrix}$

Since B_(k) ^(m,i) are eigenvectors of a hermitian matrix, they are orthogonal. Therefore, to summarize:

^(C)φ_(i) ^(m)(ρ,θ)·^(C)φ_(j) ^(l)(ρ,θ)

=2πδ(l−m)·δ(i−j)·λ_(i) ^(m)  (38)

Using these results, equation (10) can be rewritten as follows to arrive at equation (12):

$\begin{matrix} \begin{matrix} {{I\left( \overset{\_}{x} \right)} = {\sum\limits_{m = 0}^{M}\; {\sum\limits_{i = 1}^{n}\; {\lambda_{i}\left\{ {{{T \otimes {{}_{}^{}{}_{}^{}}}}^{2} + {{T \otimes {{}_{}^{}{}_{}^{}}}}^{2}} \right\}}}}} \\ {= {\sum\limits_{m}\; {\left( {\Psi_{m,1}^{C}\mspace{20mu} \cdots \mspace{20mu} \Psi_{m,{M - m}}^{C}} \right){\begin{pmatrix} B_{1}^{m,1} & \cdots & B_{M - m}^{m,1} \\ \vdots & ⋰ & \vdots \\ B_{1}^{m,{M - m}} & \cdots & B_{M - m}^{m,{M - n}} \end{pmatrix} \cdot}}}} \\ {{\begin{pmatrix} B_{1}^{m,1} & \cdots & B_{M - m}^{m,1} \\ \vdots & ⋰ & \vdots \\ B_{1}^{m,{M - m}} & \cdots & B_{M - m}^{m,{M - n}} \end{pmatrix}\begin{pmatrix} \Psi_{m,1}^{C} \\ \vdots \\ \Psi_{m,{M - m}}^{C} \end{pmatrix}}} \\ {= {{\sum\limits_{m = 0}^{M}\; {\sum\limits_{i = 1}^{n}\; {\Psi_{m,i}^{C}}^{2}}} + {\Psi_{m,i}^{S}}^{2}}} \end{matrix} & (39) \end{matrix}$

by the orthogonally of the eigenvectors B_(k) ^(m,i).

When the optical system is subject to aberrations, the kernel of equation (25) is modified as follows:

$\begin{matrix} \begin{matrix} {{K\left( {\rho_{1},\rho_{2},\theta_{1},\theta_{2}} \right)} = {{A\left( {\rho_{1},\theta_{1}} \right)} \cdot {A^{*}\left( {\rho_{2},\theta_{2}} \right)} \cdot}} \\ {\frac{J_{1}\left( {\sigma \sqrt{\begin{matrix} {\rho_{1}^{2} + \rho_{2}^{2} -} \\ {2\rho_{1}\rho_{2}{\cos \left( {\theta_{1} - \theta_{2}} \right)}} \end{matrix}}} \right)}{\sigma \sqrt{\begin{matrix} {\rho_{1}^{2} + \rho_{2}^{2} -} \\ {2\rho_{1}\rho_{2}{\cos \left( {\theta_{1} - \theta_{2}} \right)}} \end{matrix}}}} \\ {= {2{\sum\limits_{k = 0}^{\infty}\; {\left( {1 + k} \right){{A\left( {\rho_{1},\theta_{1}} \right)} \cdot}}}}} \\ {{{A^{*}\left( {\rho_{2},\theta_{2}} \right)} \cdot {A^{*}\left( {\rho_{2},\theta_{2}} \right)} \cdot}} \\ {{\frac{J_{1 + k}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}}\frac{J_{1 + k}\left( {\sigma\rho}_{2} \right)}{{\sigma\rho}_{2}}}} \\ {{{C_{k}^{1}\left( {\cos \left( {\theta_{1} - \theta_{2}} \right)} \right)} =}} \\ {= {2{\sum\limits_{K = 0}^{\infty}\; {\left( {1 + k} \right){{A\left( {\rho_{1},\theta_{1}} \right)} \cdot}}}}} \\ {{{A^{*}\left( {\rho_{2},\theta_{2}} \right)} \cdot \frac{J_{1 + k}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}}}} \\ {{\frac{J_{1 + k}\left( {\sigma\rho}_{2} \right)}{{\sigma\rho}_{2}}{\sum\limits_{m = 0}^{k}\; {\cos \begin{bmatrix} \left( {k - {2m}} \right) \\ \left( {\theta_{1} - \theta_{2}} \right) \end{bmatrix}}}}} \\ {= {2{\sum\limits_{k = 0}^{\infty}\; {\sum\limits_{m = 0}^{k}\; {\left( {1 + k} \right){{A\left( {\rho_{1},\theta_{1}} \right)} \cdot}}}}}} \\ {{{A^{*}\left( {\rho_{2},\theta_{2}} \right)} \cdot \frac{J_{1 + k}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}}}} \\ {{\frac{J_{1 + k}\left( {\sigma\rho}_{2} \right)}{{\sigma\rho}_{2}} \cdot \left\{ {{\cos \left\lbrack {\left( {k - {2m}} \right)\theta_{1}} \right\rbrack} \cdot} \right.}} \\ {{{\cos \left\lbrack {\left( {k - {2m}} \right)\theta_{2}} \right\rbrack} +}} \\ \left. {{\sin \left\lbrack {\left( {k - {2m}} \right)\theta_{1}} \right\rbrack} \cdot {\sin \left\lbrack {\left( {k - {2m}} \right)\theta_{2}} \right\rbrack}} \right\} \end{matrix} & \left\lbrack {{equation}\mspace{20mu} (40)} \right\rbrack \end{matrix}$

It can be seen in this equation that the radial and angular parts of the kernel remain separable, as in equation (25).

Equation (40) can be solved by multiplying both sides of the equation by

${\sqrt{1 + k} \cdot {A\left( {\rho_{1},\theta_{1}} \right)} \cdot \frac{J_{1 + k}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}}}{\cos \left\lbrack {\left( {k - {2m}} \right)\theta_{1}} \right\rbrack}$ and ${{\sqrt{1 + k} \cdot {A\left( {\rho_{1},\theta_{1}} \right)} \cdot \frac{J_{1 + k}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}}}{\sin \left\lbrack {\left( {k - {2m}} \right)\theta_{1}} \right\rbrack}},$

and then integrating, as in equation (31). The resulting eigenfunctions are given by equation (13) above. The secular matrix in this case is given by:

$\begin{matrix} {A_{n,k}^{m,1} \equiv {\int_{0}^{\infty}{\rho_{1}{\rho_{1}}\sqrt{\left( {1 + n} \right)}\sqrt{\left( {1 + k} \right)}{{{A\left( {\rho_{1},\theta_{1}} \right)}}^{2} \cdot \cdot \frac{J_{1 + n}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}} \cdot \frac{J_{1 + k}\left( {\sigma\rho}_{1} \right)}{{\sigma\rho}_{1}} \cdot {\cos \left\lbrack {\left( {k - {2l}} \right)\theta_{1}} \right\rbrack} \cdot {\cos \left\lbrack {\left( {n - {2m}} \right)\theta_{1}} \right\rbrack}}}}} & (41) \end{matrix}$

When the expansion functions of equation (16) are used, however, the aerial image density can be calculated without explicitly finding the terms of this matrix. 

1. A method for inspecting a mask, comprising: projecting radiation through the mask using an optical system so as to form an actual aerial image; deriving a transmission function from a design of the mask; generating a simulated aerial image by convolving a mathematical model of the optical system with the transmission function; and comparing the simulated aerial image to the actual aerial image in order to identify a defect in the mask.
 2. The method according to claim 1, wherein generating the simulated aerial image comprises determining a coherence characteristic of the optical system, and creating the mathematical model as a coherent decomposition of the optical system based on the coherence characteristic, the decomposition comprising a series of expansion functions having angular and radial components that are expressed as explicit functions.
 3. The method according to claim 1, wherein generating the simulated aerial image comprises defining a kernel function representing characteristics of the optical system, and creating the mathematical model as a coherent decomposition of the optical system into a series of expansion functions, which are components of eigenfunctions of the kernel function, without computing eigenvalues of the kernel function.
 4. The method according to claim 1, wherein generating the simulated aerial image comprises determining one or more aberrations of the optical system, and generating the mathematical model responsively to the aberrations.
 5. The method according to claim 4, wherein determining the one or more aberrations comprises representing a point spread function (PSF) of the optical system as a series of aberration terms having respective coefficients, and wherein generating the mathematical model comprises calculating a series of expansion functions representing the optical system responsively to the respective coefficients of the aberration terms.
 6. Apparatus for inspecting a mask, comprising: an optical system, which is adapted to project radiation through the mask so as to form an actual aerial image; and an image processor, which is coupled to receive the actual aerial image and is adapted, responsively to a transmission function derived from a design of the mask, to generate a simulated aerial image by convolving a mathematical model of the optical system with the transmission function, and to compare the simulated aerial image to the actual aerial image in order to identify a defect in the mask.
 7. The apparatus according to claim 6, wherein the optical system has a coherence characteristic, and wherein the image processor is adapted to create the mathematical model as a coherent decomposition of the optical system based on the coherence characteristic, the decomposition comprising a series of expansion functions having angular and radial components that are expressed as explicit functions.
 8. The apparatus according to claim 6, wherein the image processor is adapted, responsively to a kernel function representing characteristics of the optical system, to create the mathematical model as a coherent decomposition of the optical system into a series of expansion functions, which are components of eigenfunctions of the kernel function, without computing eigenvalues of the kernel function.
 9. The apparatus according to claim 6, wherein the image processor is adapted to receive an indication of one or more aberrations of the optical system, and to generate the mathematical model responsively to the aberrations.
 10. The apparatus according to claim 9, wherein the image processor is adapted to represent a point spread function (PSF) of the optical system as a series of aberration terms having respective coefficients, and to generate the mathematical model using a series of expansion functions representing the optical system responsively to the respective coefficients of the aberration terms.
 11. A computer software product for use in inspection of a mask, the product comprising a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to receive an actual aerial image of the mask projected by an optical system, and responsively to a transmission function derived from a design of the mask, to generate a simulated aerial image by convolving a mathematical model of the optical system with the transmission function, and to compare the simulated aerial image to the actual aerial image in order to identify a defect in the mask. 